[Test] Wait for cluster to form before assertions #130973
Conversation
Nodes that are not part of the publish quorum may not have applied the latest cluster state update when awaitMasterNode returns. This PR fixes it by waiting for the desired number of nodes.
Relates: elastic#129118
Resolves: elastic#130883
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
internalCluster().stopCurrentMasterNode();
awaitMasterNode();
assertNotEquals(originalMaster, internalCluster().getMasterName());
ensureStableCluster(2 + numDataNodes); // wait for all nodes to join
This weakens the test slightly more than I'd like. The trouble here is in fact that these tests run with autoManageMasterNodes = false so they skip over the call to validateClusterFormed() when starting nodes here:
elasticsearch/test/framework/src/main/java/org/elasticsearch/test/InternalTestCluster.java
Lines 2252 to 2254 in ecbc360
if (autoManageMasterNodes) {
    validateClusterFormed();
}
That makes sense in general because if we're not auto-bootstrapping the test cluster we cannot expect it to be fully-formed after each node starts. But in these tests we can expect the cluster to be fully formed after we've finished starting all the nodes, so we should call that method explicitly ourselves there. Importantly, I think we should do that before stopping the original master node, since the new master node should not spuriously drop any of the other nodes from the cluster when it takes over.
I think I'd like us to rework these tests slightly so that, rather than using assertBusy() to check that the current state has a voting configuration of the right size, we use ESIntegTestCase#awaitClusterState(Predicate&lt;ClusterState&gt;) to wait for a state which has the right voting configuration and the right number of nodes.
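For illustration, a minimal sketch of what such a wait could look like (the predicate details and the expectedMasterNodes / numDataNodes variables are assumptions for the example, not the exact code in the PR):

// hedged sketch: wait for a committed cluster state whose last-committed voting
// configuration and total node count both match what the test expects, instead of
// polling with assertBusy().
awaitClusterState(state -> {
    final var votingConfig = state.coordinationMetadata().getLastCommittedConfiguration();
    return votingConfig.getNodeIds().size() == expectedMasterNodes
        && state.nodes().getSize() == expectedMasterNodes + numDataNodes;
});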
Updated as suggested in ce1452c
Please let me know if it looks right to you. Thanks!
The update also includes a fix for a new but similar failure: #130979
/**
 * Waits for all nodes in the cluster to have a consistent view of which node is currently the master.
 * Waits for all nodes forming the publish quorum in the cluster to have a consistent view of which node is currently the master.
This isn't correct. A publish quorum is some subset of the master-eligible nodes in the cluster (e.g. 2 of the 3 masters) which is much weaker than what this does. Here we're waiting for all the nodes in the cluster (i.e. in ClusterState#nodes).
Thanks! I meant that the nodes could still be on different cluster state versions after this method returns, is that true? But you are right that the nodes should all see the same master. The trouble here is that there could still be nodes trying to join the cluster when this returns. In that case, it does not guarantee that "all nodes", which is more than ClusterState#nodes, see the same master. Technically that is the current behaviour, but it kind of defeats the purpose in such a situation?
The setWaitForEvents(Priority.LANGUID) means that there was a point in time during the execution of this method where there was a committed cluster state and all the nodes in that cluster state were on the same cluster state version (or stuff was timing out, but that doesn't happen in these tests). It can't guarantee that another publication hasn't started before the method returns ofc.
You're right that there may also be nodes still trying to join the cluster at this point, but they're not in the cluster (they haven't joined it yet). In practice, it's up to the caller to account for this.
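For context, here is a minimal sketch of the kind of cluster health call being described (an assumed shape for illustration, not the actual awaitMasterNode implementation):

// hedged sketch: a cluster health request with wait-for-events at LANGUID priority only
// returns once every pending cluster state task up to that priority has been processed,
// i.e. at some point there was a committed state that all its nodes had applied.
ClusterHealthResponse response = clusterAdmin().prepareHealth(TEST_REQUEST_TIMEOUT)
    .setWaitForEvents(Priority.LANGUID)
    .get();
assertFalse("cluster health request timed out", response.isTimedOut());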
I reverted changes to the comment in 5445ef6. We can update it separately (if at all). Thanks!
internalCluster().stopCurrentMasterNode();
awaitMasterNode();
internalCluster().validateClusterFormed();
I'd rather this remained as just an awaitMasterNode(). That should be sufficient here...
I think it is theoretically possible that the data node fails to join the new master, so that awaitMasterNode() returns without seeing the data node. If we later ask the data node (via a randomized client) about its master, it could fail with an NPE. That said, it is not a practical concern for this test, so I reverted back to using awaitMasterNode().
theoretically possible that the data node fails to join the new master
No, that shouldn't be possible (i.e. I'd want the test to fail if that happens). The new master will start from the state that the old master last committed, which will include the data node, so the data node will automatically be part of the new master's cluster. The new master would only remove the data node from its cluster if something positively fails (e.g. network disconnect).
    equalTo(false)
);
final ClusterState state = clusterAdmin().prepareState(TEST_REQUEST_TIMEOUT).get().getState();
assertThat(state.nodes().size(), equalTo(2 + numDataNodes));
... except for this new assertion, which I think we should revert. This requires us to wait for the old master to be removed, and that shouldn't be necessary to make the more important assertion that the elected master is not voting-only.
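For illustration, a minimal sketch of the assertion that remains once the node-count check is reverted (the voting-only role check is an assumption about the surrounding test code, not the exact line in the PR):

// hedged sketch: assert only that the elected master is not a voting-only node,
// without requiring the old master to have already been removed from the cluster state.
final ClusterState state = clusterAdmin().prepareState(TEST_REQUEST_TIMEOUT).get().getState();
assertThat(
    state.nodes().getMasterNode().getRoles().contains(DiscoveryNodeRole.VOTING_ONLY_NODE_ROLE),
    equalTo(false)
);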
Good point. I removed this.
// start a fresh full master node, which will be brought into the cluster as master by the voting-only nodes
final String newMaster = internalCluster().startNode();
internalCluster().validateClusterFormed();
I think we should revert this too, it's stronger than needed. InternalTestCluster#getMasterName already waits for the new master to be elected, and that's enough.
I think my reply above applies here too. Thanks!
DaveCTurner left a comment
LGTM
This has been a good case for me to dig a bit deeper into the cluster coordination area. Thanks a lot for the review and discussion!
The original cluster should be properly formed before the tests kick off. This PR ensures that.
Relates: #129118
Resolves: #130883
Resolves: #130979